Summary

The Underwater Explosions Research Division (UERD) of 
David Taylor Naval Ship Research and Development Center (DTNSRDC) 
makes extensive use of NASTRAN/COSMIC on a CDC 176 to evaluate 
the structural response of ship structures subjected to 
underwater explosion shock loadings in the time domain. As 
relatively new users, UERD research engineers have encountered 
many problems on various levels during the analysis process and 
have found it necessary to utilize the checkpoint/restart feature 
of NASTRAN/COSMIC. As the User's Manual is vague on the subject
of checkpoints/restarts, a set of working procedures was
developed for the implementation of the checkpoint/restart
feature in the transient analysis (Rigid Format 9) of single
stage structural models and multi-stage substructure models.

These working procedures are the subject of this paper. Examples 
are illustrated in the Appendix to highlight these procedures for 
a CDC 176 computer. 


Introduction 

NASTRAN/COSMIC was designed to run large problems usually 
requiring lengthy execution times and/or large memory 
allocations. User errors are common. Operator, hardware, or 
system failures resulting in the abnormal termination of a 
problem are not entirely uncommon, even with the best of computer 
systems. Due to machine and code dependent parameters, the 
termination of a run because of exceeded time and/or memory 
allocations is quite possible. 

To prevent costly loss of information generated 
immediately prior to the point of termination and/or to allow 
added flexibility as well as efficiency in the solution process, 
the user is encouraged to utilize the checkpoint/restart feature 
available in NASTRAN/COSMIC. 

The checkpoint/restart feature of NASTRAN/COSMIC was 
designed to allow the user to checkpoint a NASTRAN run and later 
restart it by executing only those modules needed for completion 
of the solution. The restart deck submitted to NASTRAN may 
include corrections to erroneous or omitted data in the original 
checkpoint run, additional data entries, or may simply consist of 
the original data deck in cases where the program terminated 
abnormally due to a system failure. Unfortunately, the NASTRAN 
User's Manual is not entirely clear on the procedures for 
checkpointing and restarting problems, particularly those for 
multi-stage substructure model analysis. Hence, a few points in
the procedure are worthy of discussion. 


Checkpoint Procedure 

The checkpoint/restart feature is applicable to the 
analysis of single stage structural models as well as multi-stage 
substructure models, and the procedure for checkpointing the 
problems is identical. 

Outlined in Example 1 of the Appendix is a listing of the 
job and executive control decks for a single stage model and a 
multi-stage substructure model analysis problem. These examples 
are accompanied by a brief explanation of the additional and 
pertinent commands necessary for the execution of the checkpoint 
procedure on a CDC 176 computer. 

The user may find it difficult to predict the memory and 
time requirements of large problems without some knowledge 
developed through experience. There are methods available which 
are somewhat dependable for the estimation of these parameters 
[1]. However, these methods cannot guarantee the successful
completion of the job, nor, in some cases, are these procedures 
easily implemented. If the problem warrants it, the time and 
memory limits may be set at the maximum values [2], but doing so 
has the trade-off of changing the priority of the job. This will 
be accompanied by a delay in the execution of the problem. In 
many instances at the DTNSRDC computer facility it is wise to 
schedule blocktime [2] for the solution of large problems, thus 
reducing the cost of execution. However, the job is still 
dependent on the system whose failures are not easily controlled 
or anticipated. 

In the case of substructure modeling, the user has the 
option of checkpointing a Phase One run for subsequent Phase 
Three restarts. It is UERD's experience that in most cases this
is neither economical nor advantageous. The disadvantages are
threefold: the checkpointed Phase One run takes more computer
time to execute, storing the large new problem tape (NPTP) adds
cost, and the increased I/O lengthens the Phase Three run.


Restart Procedure 

Assuming the checkpointed solution run terminated 
abnormally due to one of the aforementioned conditions, and both
the "NPTP" and "PUN" files were created and successfully stored,
recovery of the job consists of making a few simple modifications
to the original input deck and resubmitting it for execution. The
modifications consist of: 1) attaching the "NPTP", 2) merging the
restart dictionary into the executive control deck, 3) making any
corrections to the case control or bulk data decks, and 4)
including an "ALTER" if the problem is a substructure model
analysis.

Outlined in Examples 2 and 3 of the Appendix are listings
of the job and executive control decks required for the
unmodified restart (no changes) of single stage structural model
and multi-stage substructure model analysis problems. Each of
these examples is followed by a brief description of the
additional and pertinent commands.
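
The dictionary merge in step 2 can be sketched in Python. This is only an illustration, not part of NASTRAN: the function name `merge_restart_dictionary` is hypothetical, the cards are taken from Example 2, and placing the dictionary immediately before the CEND card is an assumption based on the layout of that example.

```python
def merge_restart_dictionary(executive_cards, dictionary_cards):
    """Return a new card list with the dictionary placed just before CEND.

    Assumption: the dictionary belongs directly ahead of the CEND card,
    as in the Example 2 listing.
    """
    executive_cards = list(executive_cards)
    dictionary_cards = list(dictionary_cards)
    for index, card in enumerate(executive_cards):
        if card.strip().upper() == "CEND":
            return (executive_cards[:index]
                    + dictionary_cards
                    + executive_cards[index:])
    raise ValueError("executive control deck has no CEND card")


# Executive control cards of the restart run (from Example 2).
executive = [
    "ID A1234567,B7654321",
    "APP DISP",
    "SOL 9,0",
    "TIME 50",
    "DIAG 8,14,22",
    "CEND",
]

# First and last cards of the checkpoint dictionary (from Example 2).
dictionary = [
    "RESTART A1234567,B7654321 , 9/17/86, 48570,",
    "1, XVPS , FLAGS = 0, REEL = 1, FILE = 5",
    "2, REENTER AT DMAP SEQUENCE NUMBER 6",
    "$ END OF CHECKPOINT DICTIONARY",
]

for card in merge_restart_dictionary(executive, dictionary):
    print(card)
```

The function returns a new list rather than editing the deck in place, so the original checkpoint deck remains available if the restart has to be prepared again.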

If the termination is due to error(s) in the case control
or bulk data, the effective changes should be included in the
restart run. For case control errors, the correction is either
added to the deck or substituted for the erroneous command. If
errors exist in the bulk data, only the corrections need to be
included and the rest of the bulk data deck is omitted.
Adjustments to time and memory limits may be required depending
on their values in the checkpointed run and the point at which
the job terminated in the solution sequence. The point of
termination is determined by examining the dayfile messages or by
inspecting the checkpointed DMAP sequence list which appears in
the PUN file or restart dictionary (the last sequence reentered
is the point of termination).
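
The dictionary inspection described above can be sketched in Python. This is a minimal illustration which assumes the dictionary cards follow the layout shown in the Appendix examples ("n, REENTER AT DMAP SEQUENCE NUMBER k"); the function name `last_reentered_sequence` is hypothetical and not part of NASTRAN.

```python
import re

# Matches dictionary cards such as "205, REENTER AT DMAP SEQUENCE NUMBER 125".
# The card layout is an assumption taken from the Appendix listings.
_REENTER = re.compile(r"REENTER\s+AT\s+DMAP\s+SEQUENCE\s+NUMBER\s+(\d+)")


def last_reentered_sequence(dictionary_text):
    """Return the sequence number on the last REENTER card, or None."""
    last = None
    for line in dictionary_text.splitlines():
        match = _REENTER.search(line)
        if match:
            last = int(match.group(1))
    return last


# Abbreviated dictionary built from cards shown in Example 2.
sample = """\
RESTART A1234567,B7654321 , 9/17/86, 48570,
1, XVPS , FLAGS = 0, REEL = 1, FILE = 5
2, REENTER AT DMAP SEQUENCE NUMBER 6
204, XVPS , FLAGS = 0, REEL = 1, FILE = 86
205, REENTER AT DMAP SEQUENCE NUMBER 125
206, PDT , FLAGS = 0, REEL = 1, FILE = 87
$ END OF CHECKPOINT DICTIONARY
"""

print(last_reentered_sequence(sample))
```

For the abbreviated Example 2 dictionary the function reports sequence 125, which agrees with the point of termination discussed for that example.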

In most large problems for which the time or memory
allocations were insufficient, the program is likely to terminate
in the dynamic loop. For substructure analysis problems which
terminate within the dynamic loop, restarting the problem
requires the addition of an ALTER statement which enters the DMAP
sequence immediately after the last sequence checkpointed (see
Example 3, statement 20). The purpose of this ALTER is to
regenerate substructure control deck information which is
required for recovery of the solution vector but is not
checkpointed by NASTRAN/COSMIC. This problem occurs at executive
decision making levels in the solution process and can only be
remedied by the inclusion of an ALTER statement at this time.
Future code changes may correct this problem.
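
The ALTER package can be assembled mechanically once the restart point is known. The Python sketch below is illustrative only: `build_alter_cards` is a hypothetical helper, the DMAP statement is passed in by the user rather than generated, and the ALTER number is assumed to be the sequence number at which the dictionary shows the restart reentering (126 in Example 3).

```python
def build_alter_cards(sequence_number, dmap_statements):
    """Wrap user-supplied DMAP statements in ALTER / ENDALTER cards.

    sequence_number is assumed to be the restart point read from the
    checkpoint dictionary; the statements are copied unchanged.
    """
    if sequence_number < 1:
        raise ValueError("DMAP sequence numbers start at 1")
    cards = ["ALTER {} $".format(sequence_number)]
    cards.extend(dmap_statements)
    cards.append("ENDALTER")
    return cards


# Placeholder statement: the real SGEN card must be copied from Example 3.
sgen_card = "SGEN <data blocks and parameters as listed in Example 3> $"

for card in build_alter_cards(126, [sgen_card]):
    print(card)
```

Keeping the SGEN statement as an input avoids hard-coding data block names, which the user should take from the listing in Example 3.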


Concluding Remarks 

This paper illustrates working procedures for application 
of the checkpoint/restart feature to the transient analysis using 
NASTRAN/COSMIC. The importance of the substructure modeling 
technique has grown in proportion to the growth in complexity of 
problems UERD research engineers are tasked to solve. Just as the 
complexity of problems increases, so does the need for flexible 
and efficient solution techniques. The checkpoint/restart feature 
of NASTRAN/COSMIC was designed to accomplish this objective. 
Following the procedures illustrated in this paper will help new
users become more proficient in the use of this powerful tool.


APPENDIX


Example # 1: Checkpointing of direct transient analysis of
single stage or multi-stage structural models.

1. CS~~,CM260000,T50,P3.
2. CHARGE, CS~~, XXXXXXXXXX.
3. LIMIT, 7777.
4. REQUEST, OUT, *PF.
5. REQUEST, NPTP, *PF.
6. REQUEST, PUN, *PF.
7. MSACCES, .
8. ATTACH, NASTRAN.
9. BEGIN, NASTRAN, NASTRAN, 260000, , OUT, PUN.
10. EXIT, U.
11. CATALOG, OUT, OUTPUT, ID=CS~~.
12. CATALOG, NPTP, ID=CS~~.
13. CATALOG, PUN, RESTDICTNRY, ID=CS~~.
14. EOR
15. NASTRAN TITLEOPT=-2, SYSTEM(71)=1, FILES=NPTP
16. ID FOREMAST, ANALYSIS
17. APP DISP
18. SOL 9,0
19. TIME 50
20. DIAG 8,14,22
21. CHKPNT YES
22. CEND
23. TITLE
    *
    *
    ETC.

*Note: CASE CONTROL AND BULK DATA DECKS AS USUAL.


Description of Commands 

1. The central memory "CM260000" and total job time "T50" 
resources allocated here may be the cause of an abnormal 
termination and may need to be adjusted for the restart run 
depending on the size of the problem and how much of the solution 
was completed [2]. Close inspection of the dayfile and output
messages may show the reason(s) and point of termination. When
the job aborts due to exceeding the CPU time limit, the system 
will allow time to catalog and unload files, so the "NPTP" and 
"PUN" files will be available for restart. 

3. The amount of mass storage which may be used at one time is 
specified via the "LIMIT" card [2]. If the mass storage is 
inadequate the job will terminate. Again, the need to increase 
this parameter will be determined by the size of the problem and 
how much of the solution was completed prior to termination.

5., 6. To catalog these files, which are essential to restart the
problem, they must be requested as permanent files. The "NPTP"
(new problem tape) is the file that contains the information
generated prior to termination needed to complete the solution.
The "PUN" (punch output file) is a file containing the checkpoint
dictionary (a complete listing of all DMAP sequences that were
executed and checkpointed). The checkpoint dictionary must be
edited to remove all "EOR" messages that appear, then merged into
the executive control deck of the restart run. The creation of
these files is mandatory.
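
The editing step can be automated. The Python sketch below is a hedged illustration: it assumes that each "EOR" message occupies a line of its own in the PUN file, which should be confirmed against an actual file, and the function name `strip_eor_lines` is hypothetical.

```python
def strip_eor_lines(pun_text):
    """Drop lines consisting only of an EOR marker; keep all other cards.

    Assumption: an EOR message is a line whose sole content is "EOR".
    """
    kept = [line for line in pun_text.splitlines()
            if line.strip().upper() != "EOR"]
    return "".join(line + "\n" for line in kept)


# Hypothetical PUN file contents, using cards from Example 2.
pun_file = """\
RESTART A1234567,B7654321 , 9/17/86, 48570,
1, XVPS , FLAGS = 0, REEL = 1, FILE = 5
EOR
2, REENTER AT DMAP SEQUENCE NUMBER 6
EOR
$ END OF CHECKPOINT DICTIONARY
"""

print(strip_eor_lines(pun_file), end="")
```

Only whole-line markers are removed, so a card that merely contains the letters "EOR" inside a longer name is left untouched.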

9. The files "OUT" and "PUN" should appear as parameters in 
this statement for definition and creation. 

12., 13. The files "NPTP" and "PUN" must be catalogued for
retrieval.

15. Among the parameters utilized, the user is urged to set
"SYSTEM(71)" to "1", which will suppress some of the information
routed to the dayfile. This information is, however, printed in
the output file. This setting reduces the chance of the program
terminating due to exceeding the dayfile message limit but still
provides the information which may be useful in tracking other
errors. The "NPTP" must be specified as an executive file via
"FILES=NPTP" on the NASTRAN card.

16. The problem ID should be specified as per instructions in 
the User's Manual for the checkpoint run [3]. This is due to the 
format requirements of the restart card in the checkpoint 
dictionary. Incorrect format will cause difficulty in the restart
process.

19. This command specifies the maximum time allotted to NASTRAN 
for problem execution. If the amount specified is inadequate the 
job will terminate, producing fatal error messages in the output. 
The user will then be required to submit a restart deck to 
recover the job. The time may need to be increased upon restart 
depending on the point of termination. The time required for 
NASTRAN execution is less than the total job time required. 

21. This command initiates the checkpoint process. It is
mandatory for checkpointing the problem [3].


Example # 2: Unmodified restart of direct transient analysis of
single stage structural model 


1. CS~~,CM260000,T500,P3.
2. CHARGE, CS~~, XXXXXXXXXX.
3. LIMIT, 7777.
4. REQUEST, OUT, *PF.
5. MSACCES, XXXXX.
6. ATTACH, NASTRAN.
7. ATTACH, OPTP, NPTP, ID=CS~~.
8. BEGIN, NASTRAN, NASTRAN, 260000, , OUT.
9. EXIT, U.
10. CATALOG, OUT, OUTPUT, ID=CS~~.
11. EOR
12. NASTRAN TITLEOPT=-2, SYSTEM(71)=1, FILES=OPTP
13. ID A1234567,B7654321
14. APP DISP
15. SOL 9,0
16. TIME 50
17. DIAG 8,14,22
18. RESTART A1234567,B7654321 , 9/17/86, 48570,
      1, XVPS , FLAGS = 0, REEL = 1, FILE = 5
      2, REENTER AT DMAP SEQUENCE NUMBER 6
      3, GPL , FLAGS = 0, REEL = 1, FILE = 6
      4, EQEXIN , FLAGS = 0, REEL = 1, FILE = 7
      5, GPDT , FLAGS = 0, REEL = 1, FILE = 8
      6, CSTM , FLAGS = 0, REEL = 1, FILE = 9
      7, BGPDT , FLAGS = 0, REEL = 1, FILE = 10
      8, SIL , FLAGS = 0, REEL = 1, FILE = 11
      9, XVPS , FLAGS = 0, REEL = 1, FILE = 12
      10, REENTER AT DMAP SEQUENCE NUMBER 7
      11, BGPDT , FLAGS = 0, REEL = 1, FILE = 13

      203, TOL , FLAGS = 0, REEL = 1, FILE = 85
      204, XVPS , FLAGS = 0, REEL = 1, FILE = 86
      205, REENTER AT DMAP SEQUENCE NUMBER 125
      206, PDT , FLAGS = 0, REEL = 1, FILE = 87
      207, XVPS , FLAGS = 0, REEL = 1, FILE = 88
      $ END OF CHECKPOINT DICTIONARY
19. CEND
20. TITLE =
    *
    *
    ETC.

*Note: IF NO EFFECTIVE CHANGES, CASE CONTROL DECK IDENTICAL
TO CHECKPOINT RUN AND BULK DATA DECK OMITTED.


Description of Commands


1. This particular problem is the restart of the checkpointed 
previous example. The problem aborted because the time allocated 
(T50) was not sufficient for completion of the run. Although the
checkpoint dictionary indicates that the program was almost
completed (the last sequence reentered was 125), the time limit
was extended to "T500" for restart. This amount would have been 
sufficient for the run to be completed in the checkpoint phase. 

3. This command remains unchanged. 

7. The "NPTP" is made available to NASTRAN via this statement, 
and must be renamed as the "OPTP" (old problem tape). This 
command is mandatory for restarting the problem since the 
NPTP/OPTP contains the information required by NASTRAN to 
continue the solution sequence. 

12. As previously mentioned, "SYSTEM(71)" is set to "1" and the
"OPTP" is specified as an executive file.

16. The NASTRAN execution time, "TIME 50", is deemed adequate 
and remains unchanged. 

18. This is a partial listing of the checkpoint/restart 
dictionary which is contained in the PUN file. The first card 
shown is the restart card [3]. This card identifies the problem 
as a restarted job. The first entry is the ID of the checkpointed 
problem. This entry is compared to the NPTP/OPTP to verify that 
it corresponds to the problem being restarted. The cards which 
follow indicate the DMAP modules which were executed and 
checkpointed. As can be seen from this deck, the last 
successfully completed and checkpointed sequence was DMAP module 
124. Number 125 was reentered, but checkpointing was not 
completed. Therefore, sequence number 125 is the point at which 
NASTRAN will pick up the solution process. This complete file is 
mandatory. 

Note: The case control deck is required for restarting the job, 
but the bulk data deck may be omitted if there are no changes. 
Also, a restart run may be checkpointed as any other problem 
which is eligible for checkpointing, but the user should weigh 
the benefits of doing so to keep computer costs minimal.


Example # 3: Unmodified restart of direct transient analysis of
multi-stage substructure model

1. CS~~,CM377700,T400,P3.
2. CHARGE, CS~~, XXXXXXXXXX.
3. LIMIT, 10000.
4. REQUEST, OUT, *PF.
5. MSACCES, XXXXX.
6. MSFETCH, OPTP, ID=CS~~.
7. ATTACH, NASTRAN.
8. ATTACH, SOFA, ID=CS~~.
9. BEGIN, NASTRAN, NASTRAN, 377700, , OUT, PUN.
10. EXIT, U.
11. CATALOG, OUT, TESTOUT, ID=CS~~.
12. EXTEND, SOFA.
13. EOR
14. NASTRAN TITLEOPT=-2, SYSTEM(71)=1, FILES=OPTP
15. ID AFTMODEL, ANALYSIS
16. APP DISP, SUBS
17. SOL 9,0
18. TIME 150
19. RESTART AFTMODEL, ANALYSIS, 8/21/86, 76863,
      1, XVPS , FLAGS = 0, REEL = 1, FILE = 6
      2, REENTER AT DMAP SEQUENCE NUMBER 6
      3, GPL , FLAGS = 0, REEL = 1, FILE = 7
      4, EQEXIN , FLAGS = 0, REEL = 1, FILE = 8
      5, GPDT , FLAGS = 0, REEL = 1, FILE = 9
      6, BGPDT , FLAGS = 0, REEL = 1, FILE = 10
      7, SIL , FLAGS = 0, REEL = 1, FILE = 11
      8, GE3S , FLAGS = 0, REEL = 1, FILE = 12
      9, GE4S , FLAGS = 0, REEL = 1, FILE = 13
      10, DYNS , FLAGS = 0, REEL = 1, FILE = 14
      11, XVPS , FLAGS = 0, REEL = 1, FILE = 15
      12, REENTER AT DMAP SEQUENCE NUMBER 5
      13, XVPS , FLAGS = 0, REEL = 1, FILE = 16
      .
      .
      .
      203, REENTER AT DMAP SEQUENCE NUMBER 125
      204, PDT , FLAGS = 0, REEL = 1, FILE = 75
      205, XVPS , FLAGS = 0, REEL = 1, FILE = 76
      206, REENTER AT DMAP SEQUENCE NUMBER 126
      207, UDVT , FLAGS = 0, REEL = 1, FILE = 77
      208, XVPS , FLAGS = 0, REEL = 1, FILE = 78
      $ END OF CHECKPOINT DICTIONARY
20. ALTER 126 $
21. SGEN CASECC,GEOM3,GEOM4,DYNAMICS/CASESS,CASEI,
      DUMA1,DUMA2,DUMA3,DUMA4,DUMA5,DUMA6,DUMA7,
      DUMA8/I/*TOTAL*/DUML/DUMN $
22. ENDALTER
23. DIAG 8,14,22
24. CEND
25. SUBSTRUCTURE PHASE 2
26. PASSWORD=XXXXXX
27. SOF(1)=SOFA,9000
28. OPTIONS K,M,P
29. SOFPRINT TOC
30. SOLVE TOTAL
31. RECOVER TOTAL
32. PRINT MARTIN
33. SOFPRINT TOC
34. ENDSUBS
35. TITLE=
    *
    *
    ETC.

*Note: IF NO EFFECTIVE CHANGES, CASE CONTROL DECK IDENTICAL
TO CHECKPOINT RUN AND BULK DATA DECK OMITTED.


Description of Commands 

1. This particular problem required the maximum amount of core 
memory available "CM377700" [2]. The time allotted here was 
reduced from the original value specified because of the point at 
which the program terminated. The solution vector was completed
but never recorded in the "SOF" (substructure operational file).
The restart was therefore performed simply to retrieve the
solution from the NPTP/OPTP and store it in the "SOF" for use in
Phase Three. This procedure required a considerable amount of
memory, but not much time.

3. The original problem did not allocate ample mass storage
(LIMIT, 7777.), which was the cause of the program termination.
The limit was extended to the maximum for the restart [2].

6. Due to the size of the NPTP/OPTP, it was placed in mass 
storage by the checkpoint run. This command retrieves the OPTP 
from mass storage for restart. 

19. This is a partial listing of the checkpoint dictionary for 
this problem (again found in the PUN file). This list indicates 
that the last DMAP module checkpointed was sequence number 125. 
Therefore, the first module flagged for execution in the restart 
phase is number 126 and the ALTER occurs at this point. 

20.-22. This ALTER regenerates information necessary for
recovery of the solution vector. SGEN is a structurally oriented
functional module producing data blocks as required for the solve
operation [4]. NASTRAN does not checkpoint some of the output
data blocks from this module which are necessary to the solution
process. Therefore, the user must regenerate these data blocks
(CASESS and CASEI) for restarting the problem. The remaining
output data blocks and parameters need not be regenerated and are
given dummy labels. These cards are mandatory for restart of
substructure analysis problems.